ABSTRACT

This study examined the impact of e-book text-tracking design on 4th graders' (10-year-old children's) learning of Chinese
characters. The e-books used in this study were created with Adobe Flash CS5.5 and ActionScript 3.0. This study was
guided by two main questions: 1) Is there any difference in learning achievement (Chinese character writing, lexical 
comprehension, and lexical usage) between groups with different e-book text-tracking designs? 2) Is there any 
difference in learning motivation (attention, confidence, relevance, satisfaction) between groups with different e-book 
text-tracking designs? This study used an experimental design in which the independent variable was the text-tracking design
of the e-books: word-based tracking or line-based tracking. A sample of forty-nine 4th graders participated in the study, and
participants were randomly assigned to one of the two groups. They first completed a pre-test and then read their
assigned e-books for forty minutes. After they finished reading, they were given a post-test and a motivation survey. The
results showed that students in the line-based tracking design group performed better in Chinese character writing and
lexical comprehension. There was no significant difference in learning motivation between groups. This study aims to
contribute to e-book design principles for young learners and to serve as a reference for elementary school teachers and
e-book publishers.
INTRODUCTION

The e-book is a new kind of reading resource for children. It 
has two main effects on early literacy learning. One is its 
effect on traditional adult-child storybook reading in the 
everyday lives of young children. The other effect is the 
cultivation of children's literacy concepts and skills in a 
digital age (Roskos, Brueck, & Widman, 2009). Many e-books
for children are programmed to be interactive and
include multimedia effects such as written text, oral
reading, oral discourse, music and sound effects. The
electronic format is better equipped than the paper book
format to focus children's attention on text features. The
benefit of accompanying written text with oral reading
is supported by Mayer's multimedia modality
principle. This principle argues that learning will be
enhanced if textual information is presented in an auditory
format rather than the usual visual format. To understand
the modality effect, researchers have conducted studies on the
combination of audio and text in e-book design. For
example, Koroghlanian and Sullivan (2000)
provided students with audio and texts of different densities.
The results showed a significant difference in
learning efficiency among groups, indicating that the way
audio and text are combined affects learning outcomes.

Most e-books for children integrate auditory
information and written text by providing visual marking,
the so-called text-tracking function; that is, the printed text
changes by highlighting and coloring as it is narrated.
Some e-books allow readers to follow the text tracking on
each screen as many times as they like, and the tracking
appears in units of sentences, phrases or separate
words (Korat, 2010). Studies show that young children
benefit from reading highlighted texts in e-books (De Jong
& Bus, 2004). Regarding the design of text-tracking, researchers
have claimed that the unit of text tracking may influence
children's improvement in word reading. For example,
highlighting text at the word level is thought to support
word reading better than highlighting sentences or phrases (Korat, 2010).
However, empirical evidence for this argument is limited,
especially for elementary learners.
Text-tracking design for children is different from that for
adults. For children, whose phonological loop, attention
control and working memory are not yet fully developed,
it is necessary to explore how to combine an e-book's textual
information with audio information. One question emerges
for e-book designers: what is the best way to design the
interface in order to present textual information and
spoken information together for children? Because research
on this issue is limited, this study aims to explore this
question in order to shed light on children's e-book audiovisual
design. Two research questions guide this
study: 1) Is there any difference in learning achievement
between groups with different text-tracking designs
(word-based tracking and line-based tracking)? 2) Is there any
difference in learning motivation between groups with
different e-book text-tracking designs (word-based tracking
and line-based tracking)? This study targeted 4th graders as
a sample. It is hoped that the research can contribute to
multimedia learning theories and also help
educators and instructional designers create better
e-books.

Literature Review 

An e-book is like a traditional storybook in several ways: it
displays print and has book parts (e.g., table of contents,
chapters, pages). However, it is also very different from
traditional books in its multimedia supports (e.g., visual aids,
auditory aids and animation) (Roskos, Brueck, & Widman,
2009). An e-book provides a new dimension that paper
books cannot offer: in addition to reading information, it is
also possible to listen to spoken information (Wittenman &
Segers, 2009). Researchers have claimed that learning is
more effective when information is synchronized in audio and
visual formats than when audio or visual information is presented
alone, and this argument is supported by Baddeley's
working memory model and Mayer's modality principle.

Baddeley's working memory model consists of four distinct
parts. Three parts, the phonological loop, the visuospatial
sketchpad and the episodic buffer, are controlled by a fourth
part, a supervisory system called the central executive. The
phonological loop processes auditory information; the
visuospatial sketchpad takes care of visual and spatial
information; and the episodic buffer integrates everything
and adds time sequencing. The episodic buffer is thought to
have links to long-term memory (Baddeley, 2000). Baddeley
conducted empirical studies to provide evidence for the
existence of phonological and visuospatial systems in
memory processing, and this model is broadly used to improve
the effectiveness of instructional multimedia.

The cognitive theory of multimedia learning and the
modality principle were built on the working
memory model. Similar to the working memory model,
multimedia learning theorists posit that there are two
different channels through which information can enter the
brain: a visual channel and an auditory channel.
Presenting all the information to one channel will create
cognitive overload, as working memory is limited. Learners
retain more information and learn more deeply when
information is presented in two channels, and this is called
the modality effect. Ginns (2005) performed a meta-analysis
of forty-three independent studies and found
sufficient support for the modality effect. However, most
studies recruited adults as participants; few considered
elementary learners. Less is known about
the modality effect in children.

The modality effect occurs only when learners are able to
transfer textual, auditory and imagery information into two
separate channels and make a connection between them. However,
children's cognitive abilities are still developing, and it
might be difficult for them to integrate different sources of
multimedia instruction. Researchers have argued that there are
two main reasons why young learners fail to read instructions
in educational multimedia: 1) inadequate attention control,
and 2) an under-developed phonological loop (Mann, Schulz,
& Cui, 2011). The first reason is children's inability to
adequately control their attention during multimedia
learning. Unlike most entertainment multimedia, educational
multimedia requires actively listening to and reading the
instructions presented in the program. To learn in a
multimedia environment, students must consciously focus
their attention to bind the separate features of a stimulus
(such as color, shape, and word) into a unitary object, and
mentally articulate their own version of the meaning in the
text. But not all children have sufficient attention control to
integrate these stimuli well. The second reason
is insufficient
mental articulation of spoken and on-screen text due to an
under-developed phonological loop. The phonological loop
system mediates the acquisition of syntax knowledge as well
as the learning of individual words (Baddeley, Gathercole, &
Papagno, 1998). The phonological loop is critical for learning
from on-screen text. On-screen text is fed into the phonological store
by means of sub-vocal speech using an articulatory system,
like an inner voice speaking to an inner ear (Baddeley, Gathercole, &
Papagno, 1998). For unfamiliar on-screen content, adults are
able to articulate the sound of the words and hear the inner
voice with their inner ear. They can quickly encode the
information directly into their phonological store through their
articulatory loop. However, young learners are not fully
capable of mentally articulating on-screen texts. Their
auditory memory consists of a phonological store without a
developed phonological loop (Gathercole, Pickering,
Ambridge & Wearing, 2004). For young children, unarticulated
material imposes extraneous cognitive load (Kalyuga,
Chandler, & Sweller, 1999).

For these reasons, children may not be able to
synchronize different sources of media and may not be able
to determine exactly which part of the visual content the
audio refers to. In such a case, not only will the
auditory and visual channels fail to interact, but the learner will
also have to use considerable cognitive
resources to determine where the audio and visual content should be
synchronized. Strategies that synchronize the two sources of
content may therefore be helpful. Several approaches
have been proposed to help children integrate
multimedia more efficiently while reading e-books. The first
approach is to provide a synchronized pointer. Presenting
a pointer synchronized with the audio and visual content
controls the learner's point of fixation and thereby
synchronizes the content temporally and spatially (Ando &
Ueno, 2008). Studies have shown that learners' acquisition of
deep knowledge is facilitated when a pointer is used in the
presentation of multimedia content (Ando & Ueno, 2008).
The other approach is to provide visual marking. Visual
marking, or the so-called text-tracking design, is one
approach to synchronizing different sources of media and
improving children's attention for learning. It refers to printed
text that changes by highlighting and coloring as it is
narrated. Visual marking can be at a lower level or a
higher level. Visual marking at a lower level emphasizes the
word that is being narrated, and at a higher level it
emphasizes the sentences, paragraphs, or sections that are
being narrated (Duarte & Carrico, 2004). In e-books,
visual marking is a popular dynamic option.
Some e-books allow readers to follow the text tracking on
each screen as many times as they like, and the tracking
appears in units of sentences, phrases, or separate words.
According to Ehri and Sweet (1991), children's orthographic
knowledge might be supported by pointing to the text while
reading. One usability concern relates to the
synchronization of the content: the synchronized visual marking
should guide the user to the text being narrated without
distracting him or her from reading (Duarte & Carrico, 2004).
Regarding the design of text-tracking, researchers have claimed that the
unit of text tracking may influence children's improvement in
word reading. For example, highlighting text at the word level
is thought to support word reading better than highlighting sentences or
phrases (Korat, 2010). However, empirical evidence for this
argument is limited, especially for elementary learners.
This study consequently aims to explore the impact of the
text-tracking unit on learning and to understand which
text-tracking unit is better in children's e-book design.

Design And Development Of An E-book For Chinese 
Character Learning 
Learning objectives 

A twenty-six-page e-book program was created to
facilitate Chinese character learning. We selected five
characters from one popular fourth-grade textbook for
Chinese language learning in Taiwan. The five characters
were composed of similar phonological components, and
we composed short passages with sentences using these
five characters. Through reading the e-book, students
were expected to 1) write the five characters correctly
(Chinese character writing), 2) understand the meaning of
the five characters (lexical comprehension), and 3) use
the five characters appropriately (lexical usage).

Program design and development 

Adobe Flash CS5.5 Professional and ActionScript 3.0 were
used to create the e-book. The program was designed with
an 841 px × 595 px window (A4 size). The structure of the
e-book was as follows: 1) opening animation, 2) cover page,
3) table of contents, 4) guidance for children, 5) story, 6) review
of new characters, 7) animation for characters, and 8) end
of the story. This study created two versions of the e-book
with different text-tracking designs: one with word-based
text-tracking and the other with line-based text-tracking.

Word-based text-tracking 

The word-based version was designed on the assumption
that highlighting at the word level
supports better readability for children: each word in the text
was colored simultaneously with the narrator's reading. Figure 1
shows a sample page of the word-based tracking
e-book.

Line-based text-tracking 

The line-based version was designed on the assumption
that highlighting at the line level
supports better readability for children: each line in the text
was colored simultaneously with the narrator's reading. Figure 2
shows a sample page of the line-based tracking
e-book.
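
The difference between the two designs can be illustrated with a
minimal sketch. It is written in Python for readability and is not
the ActionScript 3.0 code used in the e-book. The Highlight record,
the function names, and the fixed narration time per word
(word_seconds) are all hypothetical assumptions made for this
illustration; the source does not report the actual timing values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Highlight:
    start: float  # seconds at which the span is colored
    end: float    # seconds at which the coloring is removed
    text: str     # the span of text that is colored


def word_based(lines, word_seconds):
    """One highlight per word, in narration order."""
    events, t = [], 0.0
    for line in lines:
        for word in line:
            events.append(Highlight(t, t + word_seconds, word))
            t += word_seconds
    return events


def line_based(lines, word_seconds):
    """One highlight per line, lasting as long as the line is narrated."""
    events, t = [], 0.0
    for line in lines:
        duration = word_seconds * len(line)
        events.append(Highlight(t, t + duration, " ".join(line)))
        t += duration
    return events


# Hypothetical page: two lines of placeholder tokens, 0.5 s per token.
page = [["w1", "w2", "w3"], ["w4", "w5"]]
word_events = word_based(page, 0.5)
line_events = line_based(page, 0.5)

for event in word_events + line_events:
    print(f"{event.start:4.1f}-{event.end:4.1f}  {event.text}")
```

Under these assumptions, both versions cover the same narration time,
but the word-based version changes the highlighted span once per word
while the line-based version changes it only once per line.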

Method 

Research questions 

This research aimed to understand how e-book text-tracking
influences outcomes in Chinese character
learning. The independent variable was e-book text-tracking,
with two levels: 1) word-based text-tracking,
and 2) line-based text-tracking. The dependent variables
were: 1) learning achievement (character writing, lexical
comprehension, and lexical usage), and 2) learning
motivation (attention, relevance, confidence, and
satisfaction). The research questions in this study were: 1) Is
there any difference in learning achievement (Chinese
character writing, lexical comprehension, and lexical
usage) between groups who read e-books with different
text-tracking designs? 2) Is there any difference in learning
motivation (attention, confidence, relevance, satisfaction)
between groups who read e-books with different
text-tracking designs?

Figure 1. Snapshot of the word-based text-tracking design

Figure 2. Snapshot of the line-based text-tracking design

Participants 

The data collection was conducted in the 2012 fall 
semester. Forty-nine fourth-grade (10-year-old) students 
were recruited from one elementary school in southern 
Taiwan. These participants were from three different classes 
but were randomly assigned to one of the two treatment 
groups. 

Treatment 

As mentioned above, there were two versions of the e-book 
with different text-tracking: 1) word-based text-tracking, 
and 2) line-based text-tracking. Each student was randomly
assigned to one group and used that group's e-book version for
all tasks.
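
The source does not describe the randomization procedure. As a
minimal sketch of one common approach, the Python code below shuffles
participant identifiers and splits them into two groups. The function
name, the identifiers 1 to 49, and the seed are hypothetical and serve
only to make the illustration repeatable.

```python
import random


def assign_groups(participant_ids, seed=None):
    """Shuffle the participants and split them into two treatment groups."""
    rng = random.Random(seed)  # seeded generator so the split is repeatable
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = (len(shuffled) + 1) // 2  # the first group takes the extra person
    return {
        "word-based": shuffled[:half],
        "line-based": shuffled[half:],
    }


# Hypothetical identifiers for the forty-nine participants.
groups = assign_groups(range(1, 50), seed=2012)
print({name: len(members) for name, members in groups.items()})
```

With forty-nine participants this split gives groups of 25 and 24,
the same sizes as those reported for the achievement scores.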

Experimental procedures 

All students took a 10-minute pre-test composed of 10
questions on characters, including 5 fill-in-the-blank
questions and 5 multiple-choice questions. Students then
read the e-book in their assigned treatment group.
It took 30-40 minutes for students to finish
reading. After students finished reading, they were asked to
complete a post-test. In addition, information about student
motivation (attention, relevance, confidence, and
satisfaction) was collected.

Instruments 

The instruments used in the study included 1) the
achievement pre-test, 2) the achievement post-test, and 3) the
learning motivation survey. The achievement tests
(the pre-test and the post-test) were designed
with twenty items: ten items for character writing, five items
for lexical comprehension, and five items for lexical usage.
The character writing questions were fill-in-the-blank
items, and the lexical comprehension and
usage questions were multiple-choice items.
All questions aimed to test students' knowledge of the five
Chinese characters. The pre-test and post-test were
composed of similar questions, comparable in difficulty
and format. Consequently, on both tests, each student had
three separate scores for achievement. The test
questions were reviewed by two experienced elementary
school teachers to establish expert validity. The learning motivation
survey used in the study was the Instructional Materials
Motivation Survey (IMMS) designed by Keller (2010). This
survey was designed to measure reactions to self-directed
instructional materials. This five-point scale survey is
based on the ARCS model with four indicators: Attention,
Relevance, Confidence, and Satisfaction. The survey can
be used with younger students who have the appropriate
reading level. The validity of this survey has been established, and the
Cronbach's alpha coefficients for the factors of this survey
ranged from .81 to .92 (Keller, 2010).
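
For readers unfamiliar with the reliability coefficient mentioned
above, the following Python sketch computes Cronbach's alpha from a
matrix of item ratings using the standard formula (the number of
items k, the item variances, and the variance of the total score).
The ratings are invented for illustration and are not data from this
study or from Keller (2010).

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: one list of item ratings per respondent (equal lengths)."""
    k = len(scores[0])  # number of items
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical five-point ratings: five respondents, three items.
ratings = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]
alpha = cronbach_alpha(ratings)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items of a factor vary together,
which is the sense in which coefficients of .81 to .92 are taken as
evidence of internal consistency.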

Results 

Learners' pre-test scores were first examined to see if there
was any prior-knowledge difference between groups. The
results showed no significant difference between groups;
that is, students' prior knowledge of these characters was
equivalent across the two groups. ANOVA procedures were then
conducted on students' achievement scores and
motivation scores. The descriptive statistics and ANOVA
results on learning outcomes are presented in Table 1.
Students in the line-based text-tracking group performed
better in character writing (F = 4.60, p = .04) and lexical
comprehension (F = 9.18, p < .01). There were no significant
differences in lexical usage between groups.
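
As a check on the arithmetic, the F statistics for the achievement
scores can be approximately recomputed from the group sizes, means,
and standard deviations reported in Table 1. The Python sketch below
assumes a one-way between-groups ANOVA and assumes that the reported
S.D. values are sample standard deviations (n - 1 denominator);
neither detail is stated explicitly in the source.

```python
def one_way_f(groups):
    """groups: list of (n, mean, sd) tuples, one per group. Returns F."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / total_n
    # Between-groups sum of squares, from the group means.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-groups sum of squares, from the sample standard deviations.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# (N, Mean, S.D.) for the word-based and line-based groups, from Table 1.
table1 = {
    "Character writing": [(25, 5.68, 3.57), (24, 7.42, 1.77)],
    "Lexical comprehension": [(25, 6.88, 3.37), (24, 9.17, 1.55)],
    "Lexical usage": [(25, 5.12, 4.13), (24, 6.17, 2.76)],
}

for name, groups in table1.items():
    print(f"{name}: F(1, 47) = {one_way_f(groups):.2f}")
```

The recomputed values fall within a few hundredths of the reported
F values of 4.60, 9.18, and 1.08; the small gaps are consistent with
rounding in the published means and standard deviations.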

The descriptive statistics and ANOVA results on learning
motivation are presented in Table 2. There were no
significant differences in any aspect of motivation.

Dependent variable      Group                       N    Mean   S.D.   F      p
Character writing       Word-based text-tracking    25   5.68   3.57   4.60   .04*
                        Line-based text-tracking    24   7.42   1.77
Lexical comprehension   Word-based text-tracking    25   6.88   3.37   9.18   .00*
                        Line-based text-tracking    24   9.17   1.55
Lexical usage           Word-based text-tracking    25   5.12   4.13   1.08   .30
                        Line-based text-tracking    24   6.17   2.76
*p < .05

Table 1. Descriptive statistics and ANOVA results for
learners' achievement scores

Dependent variable      Group                       N    Mean   S.D.   F      p
Attention               Word-based text-tracking    25   3.29    .81   1.08   .29
                        Line-based text-tracking    23   3.52    .69
Relevance               Word-based text-tracking    25   3.37    .86   1.12   .27
                        Line-based text-tracking    23   3.65    .86
Confidence              Word-based text-tracking    25   3.42    .53    .60   .55
                        Line-based text-tracking    23   3.51    .56
Satisfaction            Word-based text-tracking    25   3.30   1.03   1.15   .26
                        Line-based text-tracking    23   3.66   1.15
*p < .05

Table 2. Descriptive statistics and ANOVA results for
learners' motivation scores

Discussion

This study examined the impact of e-book text-tracking
design on 4th graders' (10-year-old children's) learning of
Chinese characters. The treatment in this study was the
text-tracking design: word-based text-tracking or line-based
text-tracking. The findings show that the line-based
text-tracking e-book was more effective in enhancing character
writing and lexical comprehension. The superiority of the
line-based text-tracking design may be due to several reasons.
First, the word-based text-tracking design may have imposed more
cognitive load on learners than the line-based design did.
In the word-based text-tracking design, since the audio
and the text were presented simultaneously, learners had to
listen to each word and watch it at the same time. Students
may have spent all their time catching sounds and words
and understood nothing of the delivered content. In the
line-based text-tracking design, students did not have to
catch the text word by word. They could read the words at their
own pace, and the visual marking served as a reminder
telling learners where the narrator was. Moreover, in the
line-based text-tracking design, it was possible for learners to
rely on only visual information (or only auditory information) since
the narration began after the line changed color. Since
the text and audio were not presented at exactly the same time,
learners may have taken advantage of only one of the
channels. For young learners who are cognitively
underdeveloped, this process may have resulted in
insufficient mental articulation during listening and reading.
Learning with only one channel may cause less failure
compared with learning with connections between the
auditory and visual channels (Mann, Newhouse, Pagram,
Campbell & Schulz, 2002). In addition, the highlighting
speed in the word-based design may not have matched
learners' reading speeds, and this could have reduced the
modality effect. The results might have been different if we
had slowed down or sped up the narration in the
word-based text-tracking book.

In terms of the motivation scores, researchers argue that
spoken information provided with highlights or details about
visuals better increases student learning motivation,
especially attention focus (Mann, Newhouse, Pagram,
Campbell, & Schulz, 2002). The motivation survey in this
study shows that the mean scores for students in both
groups were above the midpoint of the five-point scale, which
suggests that students
had a positive view of reading e-books with
text-tracking functions. Student motivation scores in the
line-based text-tracking group were all higher than those in the
word-based text-tracking group; however, the differences
did not reach statistical significance. This indicates that the two
text-tracking approaches might not differ in terms of
improving learning motivation, and the best
approach to presenting narration and text together for
improving learning motivation still needs to be explored. For
example, other text-tracking approaches such as
sentence-based tracking, section-based tracking and
a narration pointer should be designed and examined.

Last, there are different ways to design auditory
information for better learning. Based on Mann's Structured
Sound Function (SSF) Model, there are five possible
functions for conceptualizing auditory information: 1)
temporal sound (temporal speech cueing), 2) point of view
sound, 3) locale sound, 4) atmosphere sound, and 5)
character sound. These different types of sounds fulfill
different purposes in learning (Mann, 2008). Auditory
information can serve as a supplement or as reading context
for young learners who have under-developed cognitive
abilities. The auditory information provided in this study is
temporal speech cueing, defined as spoken information
provided about future or past events that highlights or
details static or moving visuals. Examples of
temporal sound include instruction, navigational direction,
hinting, feedback and reminders (Mann, 2008). This study
focused only on the impact of narration with texts and did
not provide a comprehensive result due to its small scale.
More research needs to be done to examine
the impact of different designs of auditory information on
young learners, and this is the future direction of this study.

Conclusions 

This study examined the impact of e-book text-tracking
design on 4th graders' (10-year-old children's) learning of
Chinese characters. Students were divided into two groups
and provided with different books: 1) a word-based
text-tracking version, or 2) a line-based text-tracking version. The
results show that students who used the line-based
text-tracking version of the e-book performed significantly better in
character writing and lexical comprehension.

Several limitations need to be acknowledged in the
interpretation of these results. First, this research was
conducted in a laboratory setting rather than in a real
classroom situation; thus, students may have had different
motivations and may have exhibited different behaviors
than they would during an actual class. Second,
this research was restricted to the learning of five characters
by elementary students. Students may have responded
differently if the unit and subject had been
something else. Given these limitations, the results
should be generalized with caution. Though limitations exist,
this study aims to encourage more empirical studies by
sharing the experience of developing e-book text-tracking
for elementary learning.

We will continue to conduct studies in the following
directions: 1) exploring the impact of different audio types
(functions) on learning, 2) using an eye-tracking system to
understand learners' visual attention while listening to
audio, 3) understanding the role of narration pace in the
modality effect, and 4) confirming the modality effect in young
learners aged 7 to 12. In addition, we will keep improving
our e-book design and explore its effects in the future. The
study had a limited number of participants and a short duration; the
results might be biased and cannot be generalized
broadly. Further study with a larger sample size and more diverse
treatment groups will be conducted to provide
stronger evidence.